Thuy-Tien Bui
  • Home
  • About
  • Projects
  • Publications

Assessing and forecasting Willamette Falls fish passage

R
Modeling

Investigation of time series data to understand and predict trends in fish passage

Author

Thuy-Tien Bui

Published

March 1, 2025

Introduction & Data Summary

Fish passage is a critical component of river ecosystem health, ensuring migratory species can access spawning and rearing habitats essential for their survival. In the Pacific Northwest, species like coho salmon (Oncorhynchus kisutch), jack coho (precocious male coho that return earlier than typical adults), and steelhead trout (Oncorhynchus mykiss) rely on unobstructed passage to complete their life cycles. At Willamette Falls in Oregon, one of the largest natural waterfalls in the region, fish passage is especially important due to the historical and ongoing impacts of barriers such as dams.

Fish ladder at Fall Creek Dam in Oregon (Photo by Kristyna Wentz-Graff/OPB)

Coho salmon swimming upstream (Photo by Chuck Haney)

The Willamette Falls Fish Ladder is an effort by the Oregon Department of Fish and Wildlife to support fish migration. It is composed of three individual fish ladders that merge into a single exit point above Willamette Falls. Biologists have visually counted observations of fish passing through the fishway since 2001. The dataset used in this analysis records counts for several fish species between 2001 and 2010 at the station.

Data Citation: Columbia River DART. DART Adult Passage Counts (Willamette Falls, 2001-2010). https://www.cbr.washington.edu/dart/query/adult_graph_text. Accessed January 25, 2023.

Code
library(tidyverse)
library(lubridate)
library(ggplot2)
library(here)
library(tsibble)
library(feasts)
library(fable)
library(paletteer)
library(janitor)
library(patchwork)
library(scales)

Purpose & Methods

This analysis aims to assess and forecast fish passage time series data collected at the Willamette Falls fish ladder on the Willamette River in Oregon. The analysis focuses on three select focal species: Coho salmon, Jack coho salmon, and Steelhead trout. The following procedure outlines the methods of this analysis:

  1. Convert data frame into a time series data frame.

  2. Plot the time series to assess general temporal trends in fish passage for each species.

  3. Create a seasonplot to assess fish passage trends throughout years across the study period for each species.

  4. Aggregate fish counts by year and plot annual fish counts for each species to understand annual fish passage trends.

  5. Forecast salmon runs using Holt-Winters exponential smoothing.

Code
# Load data, replace NA with 0, and convert to tsibble
fish_ts <- read_csv(here("data", "willamette_fish_passage.csv")) %>% 
  clean_names() %>% 
  mutate_all(~replace(., is.na(.), 0)) %>% 
  select(date, coho, jack_coho, steelhead) %>% 
  mutate(date = lubridate::mdy(date)) %>%
  as_tsibble(key = NULL, 
             index = date)
  

# Pivot data longer
fish_long <- fish_ts %>% 
  pivot_longer(-date, 
               names_to = "species", 
               values_to = "count")

Results

  • Time Series
  • Seasonplot
  • Annual Counts
  • Forecast
Code
# Plot time series and facet wrap by species
fish_long %>% 
  ggplot(aes(x = date, y = count, color = species)) +
  geom_line() +
  scale_color_manual(values = c('#BA7999FF', '#59629BFF', '#015B58FF')) +
  facet_wrap(~species, 
             nrow = 3, 
             labeller = labeller(species = 
                                   c("coho" = "Coho",
                                     "jack_coho" = "Jack Coho",
                                     "steelhead" = "Steelhead")),
             scales = "free_y") +
  labs(title = "Adult Fish Passage at Willamette Falls 2001-2010",
       x = "Date",
       y = "Count of Fish") +
  scale_y_continuous(labels = scales::comma) +
  theme_minimal() +
  theme(legend.position = "none")

Figure 1. Fish Passage Counts Time Series. Counts of fish passing through the Willamette Falls fish ladder from 2001-2010 by species.
  • All three fish species pass through Willamette Falls at relatively regular intervals annually.

  • Peak Coho and Jack coho passage align at the same time, within fairly narrow windows.

  • Steelhead passage occurs over a wider time window, with variable, less-defined peaks compared to the sharp peaks observed in Coho and Jack Coho counts.

Code
# Season plot
fish_long %>%
  gg_season(y = count, pal = paletteer_d("PNWColors::Starfish")) +
  facet_wrap(~species, 
             nrow = 3, 
             labeller = labeller(species = 
                                   c("coho" = "Coho",
                                     "jack_coho" = "Jack Coho",
                                     "steelhead" = "Steelhead")),
             scales = "free_y") +
  labs(x = "Date",
       y = "Count of Fish",
       title="Seasonal Trends in Adult Fish Passage at Willamette Falls") +
  scale_y_continuous(labels = scales::comma) +
    theme_minimal()

Figure 2. Fish Passage Counts Seasonplot. Counts of fish passage by species throughout the year from 2001-2010. Counts in earlier years are green and blue and more recent years are in purple and pink.
  • Peak passage times for Coho and Jack Coho occur annually during the fall. There are lower and narrower peaks in earlier years compared to more recent years.

  • Steelhead passage follows a different seasonal pattern, with variable peaks from winter through mid-summer, with counts leveling off during the late summer and fall. Higher passage counts occurred during earlier years, with more recent years observing less steelhead at Willamette Falls.

Code
# Group data by fish species and year
fish_ts_year <- fish_long %>% 
  index_by(year = ~year(.)) %>% 
  group_by(species, year) %>%
  summarise(annual_count = sum(count))

# Plot annual fish passage by species
fish_ts_year %>% 
  ggplot(aes(x = year, y = annual_count, color = species)) +
  geom_line() +
  geom_point() +
  scale_color_manual(values = c('#BA7999FF', '#59629BFF', '#015B58FF')) +
  facet_wrap(~species, 
             nrow = 3, 
             labeller = labeller(species = 
                                   c("coho" = "Coho",
                                     "jack_coho" = "Jack Coho",
                                     "steelhead" = "Steelhead")),
             scales = "free_y") +
  labs(title = "Annual Counts of Adult Fish Passage at Willamette Falls 2001-2010",
       x = "Year",
       y = "Count of Fish") +
  scale_x_continuous(breaks = scales::pretty_breaks(n = 10)) +
  scale_y_continuous(labels = scales::comma) +
  theme_minimal() +
  theme(legend.position = "none")

Figure 3. Annual Fish Counts 2001-2010. Counts of fish passage by species and year.
  • Coho numbers fluctuate over the years with a notable significant increase between 2008 and 2009.

  • Jack coho numbers also fluctuate, with a declining trend from 2002-2005. Jack Coho numbers saw a significant increase between 2007 and 2008 before the observed increase in Coho the following year. However, it is unclear from the data alone whether Coho and Jack coho species influence one another’s population dynamics.

  • Steelhead numbers show a general decline throughout the study period. However, counts of steelhead increase in 2009, which may be an indicator of population recovery efforts for the endangered species. Still, continued monitoring over time is necessary to draw meaningful conclusions.

Code
# Get data into year and month form
fish_month <- fish_long %>% 
  index_by(yr_mo = ~yearmonth(.)) %>%  
  group_by(species, yr_mo) %>% 
  summarize(count = round(sum(count, na.rm = TRUE)))  %>%  
  ungroup()

# Create forecast model for each species
steelhead_fit <- fish_month %>%
  filter(species == 'steelhead') %>%
  model(
    ets = ETS(count ~  trend(method = "A") + season(method = "A"))
  )

coho_fit <- fish_month %>%
  filter(species == 'coho') %>%
  model(
    ets = ETS(count ~  trend(method = "A") + season(method = "A"))
  )
  
jack_fit <- fish_month %>%
  filter(species == 'jack_coho') %>%
  model(
    ets = ETS(count ~  trend(method = "A") + season(method = "A"))
  )

# Forecast using the model 5 years into the future:
steelhead_forecast <- steelhead_fit %>% 
  forecast(h = "5 years", level = c(80, 95)) %>% 
  hilo(level = c(80, 95)) %>% 
 mutate(
    lower_80 = str_split(`80%`, ",") %>%
      sapply(function(x) as.numeric(str_trim(str_replace(x[1], "\\[|\\]", "")))), 
    upper_80 = str_split(`80%`, ",") %>%
      sapply(function(x) as.numeric(str_trim(str_replace(x[2], "\\]", "")))),
    lower_95 = str_split(`95%`, ",") %>%
      sapply(function(x) as.numeric(str_trim(str_replace(x[1], "\\[|\\]", "")))), 
    upper_95 = str_split(`95%`, ",") %>%
      sapply(function(x) as.numeric(str_trim(str_replace(x[2], "\\]", "")))),
  ) %>% 
  clean_names() %>% 
  select(-x80_percent, -x95_percent) %>% 
  rename(
    count = mean,
    old_count = count    
  ) %>%
  mutate(count = round(count, 0))

coho_forecast <- coho_fit %>% 
  forecast(h = "5 years", level = c(80, 95)) %>% 
  hilo(level = c(80, 95)) %>% 
  mutate(
    lower_80 = str_split(`80%`, ",") %>%
      sapply(function(x) as.numeric(str_trim(str_replace(x[1], "\\[|\\]", "")))), 
    upper_80 = str_split(`80%`, ",") %>%
      sapply(function(x) as.numeric(str_trim(str_replace(x[2], "\\]", "")))),
    lower_95 = str_split(`95%`, ",") %>%
      sapply(function(x) as.numeric(str_trim(str_replace(x[1], "\\[|\\]", "")))), 
    upper_95 = str_split(`95%`, ",") %>%
      sapply(function(x) as.numeric(str_trim(str_replace(x[2], "\\]", "")))),
  ) %>% 
  clean_names() %>% 
  select(-x80_percent, -x95_percent) %>% 
  rename(
    count = mean,
    old_count = count    
  ) %>%
  mutate(count = round(count, 0))

jack_forecast <- jack_fit %>% 
  forecast(h = "5 years", level = c(80, 95)) %>% 
  hilo(level = c(80, 95)) %>% 
  mutate(
    lower_80 = str_split(`80%`, ",") %>%
      sapply(function(x) as.numeric(str_trim(str_replace(x[1], "\\[|\\]", "")))), 
    upper_80 = str_split(`80%`, ",") %>%
      sapply(function(x) as.numeric(str_trim(str_replace(x[2], "\\]", "")))),
    lower_95 = str_split(`95%`, ",") %>%
      sapply(function(x) as.numeric(str_trim(str_replace(x[1], "\\[|\\]", "")))), 
    upper_95 = str_split(`95%`, ",") %>%
      sapply(function(x) as.numeric(str_trim(str_replace(x[2], "\\]", "")))),
  ) %>% 
  clean_names() %>% 
  select(-x80_percent, -x95_percent) %>% 
  rename(
    count = mean,
    old_count = count    
  ) %>%
  mutate(count = round(count, 0))

# Combine dataframes for plotting
steelhead_forecast_df <- bind_rows(
  fish_month %>% mutate(source = "Observed"),
  steelhead_forecast %>% mutate(source = "Forecasted")
)

coho_forecast_df <- bind_rows(
  fish_month %>% mutate(source = "Observed"),
  coho_forecast %>% mutate(source = "Forecasted")
)

jack_forecast_df <- bind_rows(
  fish_month %>% mutate(source = "Observed"),
  jack_forecast %>% mutate(source = "Forecasted")
)

# Plot for coho
p1 <- ggplot(coho_forecast_df, aes(x = yr_mo, y = count, color = source)) +
  geom_ribbon(aes(ymin = lower_95, ymax = upper_95, fill = "95% CI"), 
              alpha = 1, color = NA) +  
  geom_ribbon(aes(ymin = lower_80, ymax = upper_80, fill = "80% CI"), 
              alpha = 1, color = NA) +
  geom_line() + 
  labs(title = "Coho") +
  scale_color_manual(values = c("Observed" = "black", "Forecasted" = "#BA7999FF")) +
  scale_fill_manual(values = c("80% CI" = "#d2a4ba", "95% CI" = "#e9d1dc")) +
  theme_minimal() +
  theme(legend.title = element_blank(),
    axis.title.x = element_blank(), 
    axis.title.y = element_blank() )

# Plot for jack coho
p2 <- ggplot(jack_forecast_df, aes(x = yr_mo, y = count, color = source)) +
  geom_ribbon(aes(ymin = lower_95, ymax = upper_95, fill = "95% CI"), 
              alpha = 1, color = NA) +  
  geom_ribbon(aes(ymin = lower_80, ymax = upper_80, fill = "80% CI"), 
              alpha = 1, color = NA) +
  geom_line() + 
  labs(title = "Jack Coho", y = "Count") +
  scale_color_manual(values = c("Observed" = "black", "Forecasted" = "#59629BFF")) +
  scale_fill_manual(values = c("80% CI" = "#8c94bd", "95% CI" = "#c4c9de")) +
  theme_minimal() +
  theme(legend.title = element_blank(),
        axis.title.x = element_blank(),)

# Plot for steelhead
p3 <- ggplot(steelhead_forecast_df, aes(x = yr_mo, y = count, color = source)) +
  geom_ribbon(aes(ymin = lower_95, ymax = upper_95, fill = "95% CI"), 
              alpha = 1, color = NA) +  
  geom_ribbon(aes(ymin = lower_80, ymax = upper_80, fill = "80% CI"), 
              alpha = 1, color = NA) +
  geom_line() + 
  labs(title = "Steelhead", x = "Date") +
  scale_color_manual(values = c("Observed" = "black", "Forecasted" = "#015B58FF")) +
  scale_fill_manual(values = c("80% CI" = "#7ca19e", "95% CI" = "#afc6c4")) +
  theme_minimal() +
  theme(legend.title = element_blank(),
        axis.title.y = element_blank(),)

# Combine the plots vertically and set the layout
forecast_plot <- p1 / p2 / p3 + 
  plot_layout(guides = "collect") +
  plot_annotation(title = "Five-year fish count forecast")

# Print the combined plot
print(forecast_plot)

  • The forecast model predicts similar overall patterns over the next 5 years in this data time frame. However, the magnitude and granularity of the predicted counts is simplified.

  • In the case of Jack Coho, forecasted fish counts did not resemble observed trends, with predicted counts significantly decreasing over the next 5 years.

  • Given these results, a Holt-Winters forecast method is not very useful in predicting fish counts. This is due to variable trends within the data and inability of the model to account for complex ecological dynamics that influence fish count.